Pergamon 


Pattern Recognition, Vol. 27, No. 9, pp. 1275-1289, 1994
Elsevier Science Ltd
Copyright © 1994 Pattern Recognition Society
Printed in Great Britain. All rights reserved
0031-3203/94 $7.00 + .00


0031-3203(94)E0026-H 

A  RELATIVE  ENTROPY-BASED  APPROACH  TO 
IMAGE  THRESHOLDING 

Chein-I Chang,† Kebo Chen,† Jianwei Wang† and Mark L. G. Althouse‡

† Department of Electrical Engineering, University of Maryland, Baltimore County Campus, Baltimore, MD 21228-5398, U.S.A.

‡ Edgewood Research, Development and Engineering Center, Aberdeen Proving Ground, MD 21010-5432, U.S.A.

(Received 4 August 1993; received for publication 24 February 1994)


Abstract: In this paper, we present a new image thresholding technique which uses the relative entropy
(also known as the Kullback-Leibler discrimination distance function) as a criterion for thresholding an
image. The gray level minimizing the relative entropy is taken as the desired threshold. The proposed
relative entropy approach differs from two known entropy-based thresholding techniques, the local
entropy and joint entropy methods developed by N. R. Pal and S. K. Pal, in that the former focuses
on the matching between two images while the latter emphasize only the entropy of the co-occurrence
matrix of one image. The experimental results show that the performance of these three techniques is
image dependent, and that the local entropy and relative entropy seem to perform better than the joint
entropy. In addition, the relative entropy can complement the local entropy and joint entropy by providing
details which the others cannot. In terms of computation, the relative entropy approach also requires
the fewest arithmetic operations of the three.

Thresholding  Relative  entropy  Local  entropy  Joint  entropy  Co-occurrence  matrix 


1.  INTRODUCTION 

Image thresholding often represents a first step in
image understanding. In an ideal image where objects
are clearly distinguishable from the background, the
grey-level histogram of the image is generally bimodal.
In this case, a best threshold segmenting objects from
the background is one placed in the valley between the
two peaks of the histogram. However, in most cases the
grey-level histograms of images to be segmented are
multimodal. Therefore, finding an appropriate
threshold for such images is not straightforward. Various
thresholding techniques have been proposed to resolve
this problem.

In recent years, information theoretic approaches
based on Shannon's entropy concept have received
considerable interest.(1-6) Of particular interest are
two methods proposed by N. R. Pal and S. K. Pal(1)
which use a co-occurrence matrix to define second-order
local and joint entropies. The co-occurrence
matrix is a transition matrix generated by changes in
pixel intensities. For any two grey levels i and j
(not necessarily distinct), the co-occurrence
matrix describes all intensity transitions from grey
level i to grey level j. Suppose that t is the desired
threshold. Then t segments an image into the
background, which contains pixels with grey levels
below or equal to t, and the foreground, which
corresponds to objects having pixels with grey levels above
t. This t further divides the co-occurrence matrix into
four quadrants which correspond, respectively, to
transitions from background to background (BB),
background to objects (BO), objects to background
(OB) and objects to objects (OO). The local entropy is
defined only on two quadrants, BB and OO, whereas
the joint entropy is defined only on the other two
quadrants, BO and OB. Based on these two definitions,
Pal and Pal developed two algorithms, one of which
maximizes the local entropy and the other the joint entropy.

In this paper, we present an alternative entropy-based
approach which is different from those in references
(1-6). Rather than looking into entropies of
background or object individually, we introduce the
concept of the relative entropy (also known as cross
entropy, Kullback-Leibler's discrimination distance and
directed divergence), which has been widely used in
source coding for the purpose of measuring the mismatch
between two sources. Since a source is generally
characterized by a probability distribution, the
relative entropy can also be interpreted as a distance
measure between two sources. This suggests that the
relative entropy can be used as a criterion to measure
the mismatch between an image and a thresholded
bilevel image. One method of applying the relative entropy
concept to image thresholding is to calculate the gray-level
transition probability distributions of the co-occurrence
matrices for an image and a thresholded
bilevel image, respectively, then find a threshold which
minimizes the discrepancy between these two transition
probability distributions, i.e. their relative entropy.
The threshold rendering the smallest relative entropy
is selected to segment the image. As a result, the
thresholded bilevel image is the best approximation
to the original image in this sense. Since transitions of OB and
BO generally represent edge changes in boundaries
and transitions of BB and OO indicate local changes in
regions, we can anticipate that a thresholded bilevel
image produced by the proposed relative entropy approach
will best match the co-occurrence matrix of the
original image. This observation is demonstrated
experimentally by several test images. In general, the
performance of all three methods is image dependent.
Although there is no evidence that one is generally
better than the others, in the experiments
conducted in this paper the joint entropy did not work
as well as the local entropy and relative entropy.
Interestingly, among all images tested the relative
entropy approach seems to be better than the others
at finding edges. In addition, our experiments show
that the relative entropy seems to be a good complement
to the local entropy and joint entropy methods,
providing image details and descriptions different
from theirs. Finally, an advantage of the relative
entropy approach over the local and joint entropy
approaches is the saving in arithmetic operations
required for calculating entropies.

This paper is organized as follows. Section 2 describes
previous work on entropy-based thresholding
approaches. Section 3 introduces the concept of relative
entropy and presents a relative entropy-based thresholding
algorithm. In Section 4, experiments are conducted
on various test images in comparison to
the local entropy and joint entropy methods described
in reference (1). Finally, a brief conclusion is given in
Section 5.


2.  PRELIMINARIES 

2.1.  Co-occurrence  matrix 

Given a digitized image of size M x N with L
gray levels G = {0, 1, 2, ..., L-1}, we denote the image by
$F = [f(x,y)]_{M \times N}$, where $f(x,y) \in G$
is the grey level of the pixel at the spatial location (x, y).
A co-occurrence matrix of an image is an L x L
matrix, $T = [t_{ij}]_{L \times L}$, which contains information
regarding spatial dependency of grey levels in image
F as well as information about the number of
transitions between two grey levels specified in a particular
way. A widely used co-occurrence matrix is an
asymmetric matrix which only considers the grey level
transitions between two adjacent pixels, horizontally
right and vertically below.(1) More specifically, let $t_{ij}$
be the (i,j)th entry of the co-occurrence matrix T.
Following the definition in reference (1),

$$ t_{ij} = \sum_{l=1}^{M} \sum_{k=1}^{N} \delta(l,k), \tag{1} $$

where

$$ \delta(l,k) = \begin{cases} 1, & \text{if } f(l,k) = i,\ f(l,k+1) = j \ \text{ and/or } \ f(l,k) = i,\ f(l+1,k) = j, \\ 0, & \text{otherwise.} \end{cases} $$

One may make the co-occurrence matrix
symmetric by considering horizontally right and left,
and vertically above and below transitions. It has,
however, been found(1) that including horizontally left
and vertically above transitions does not provide more
information about the matrix or significant improvement.
Therefore, it is sufficient to consider adjacent
pixels which are horizontally right and vertically below,
which reduces the required computation.

Normalizing by the total number of transitions in the
co-occurrence matrix, we obtain the desired transition
probability from grey level i to j(1) as follows:

$$ p_{ij} = \frac{t_{ij}}{\sum_{k=0}^{L-1} \sum_{l=0}^{L-1} t_{kl}}. \tag{2} $$
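As a minimal sketch of Equations (1) and (2), assuming the image is supplied as a list of rows of integer grey levels, the co-occurrence matrix and the transition probabilities might be computed as follows. The function names are illustrative and do not come from the paper. Two assumptions are made: pixels in the last column or last row simply lack a right or below neighbour, and δ(l,k) is read literally, so a pixel whose right and below neighbours share the same grey level j contributes one count to t_ij rather than two.

```python
def cooccurrence(image, levels):
    """Asymmetric co-occurrence matrix T of Eq. (1) for a list-of-rows image."""
    rows, cols = len(image), len(image[0])
    t = [[0] * levels for _ in range(levels)]
    for l in range(rows):
        for k in range(cols):
            i = image[l][k]
            neighbours = set()                    # distinct grey levels reached from (l, k)
            if k + 1 < cols:
                neighbours.add(image[l][k + 1])   # horizontally right
            if l + 1 < rows:
                neighbours.add(image[l + 1][k])   # vertically below
            for j in neighbours:                  # delta(l, k) = 1 for each such pair (i, j)
                t[i][j] += 1
    return t


def transition_probabilities(t):
    """Eq. (2): divide every entry of T by the total number of transitions."""
    total = sum(map(sum, t))                      # assumes the image has at least two pixels
    return [[count / total for count in row] for row in t]


demo_t = cooccurrence([[0, 0, 1],
                       [0, 1, 1]], levels=2)
demo_p = transition_probabilities(demo_t)
```

An implementation could equally count the right and below transitions separately; the thresholds obtained may then differ slightly.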

2.2.  Quadrants  of  the  co-occurrence  matrix 

Let $t \in G$ be a threshold separating two groups (foreground
and background) in an image. The threshold t partitions
the co-occurrence matrix T, defined by (1), into four
quadrants, namely A, B, C and D, shown in Fig. 1.

These four quadrants may be separated into two
types. If we assume that pixels with grey levels above
the threshold are assigned to the foreground (objects),
and those equal to or below it are assigned to the background, then
quadrants A and C correspond to local transitions
within background and foreground, respectively,
whereas quadrants B and D represent transitions across
the boundaries of background and foreground. The
probabilities associated with each quadrant are then
defined by

$$ P_A(t) = \sum_{i=0}^{t} \sum_{j=0}^{t} p_{ij}, \qquad P_B(t) = \sum_{i=0}^{t} \sum_{j=t+1}^{L-1} p_{ij}, $$

$$ P_C(t) = \sum_{i=t+1}^{L-1} \sum_{j=t+1}^{L-1} p_{ij}, \qquad P_D(t) = \sum_{i=t+1}^{L-1} \sum_{j=0}^{t} p_{ij}. \tag{3} $$
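The quadrant probabilities of Equation (3) can be illustrated with a direct, unoptimized sketch. It assumes `p` is an L x L list of lists holding the transition probabilities of Equation (2); the function name and the small example matrix are hypothetical.

```python
def quadrant_probabilities(p, t):
    """Return (P_A, P_B, P_C, P_D) of Eq. (3) for threshold t."""
    levels = len(p)
    low = range(0, t + 1)            # background grey levels 0..t
    high = range(t + 1, levels)      # object grey levels t+1..L-1
    p_a = sum(p[i][j] for i in low for j in low)     # background -> background
    p_b = sum(p[i][j] for i in low for j in high)    # background -> object
    p_c = sum(p[i][j] for i in high for j in high)   # object -> object
    p_d = sum(p[i][j] for i in high for j in low)    # object -> background
    return p_a, p_b, p_c, p_d


example = [[0.1, 0.2, 0.0],
           [0.1, 0.1, 0.1],
           [0.0, 0.2, 0.2]]
quadrants_t0 = quadrant_probabilities(example, 0)
quadrants_t1 = quadrant_probabilities(example, 1)
```

Whatever the threshold, the four values sum to one because the quadrants partition the matrix.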

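The probabilities in each quadrant can be further
defined by the "cell probabilities", obtained by normalizing
each $p_{ij}$ by the probability of the quadrant that contains it:

$$ p_{ij}^{A} = \frac{p_{ij}}{P_A(t)} \quad \text{for } 0 \le i \le t,\ 0 \le j \le t, \tag{4} $$

$$ p_{ij}^{B} = \frac{p_{ij}}{P_B(t)} \quad \text{for } 0 \le i \le t,\ t+1 \le j \le L-1, \tag{5} $$

$$ p_{ij}^{C} = \frac{p_{ij}}{P_C(t)} \quad \text{for } t+1 \le i \le L-1,\ t+1 \le j \le L-1, \tag{6} $$

$$ p_{ij}^{D} = \frac{p_{ij}}{P_D(t)} \quad \text{for } t+1 \le i \le L-1,\ 0 \le j \le t. \tag{7} $$

Fig. 1. Quadrants of a co-occurrence matrix. Rows i and columns j are each split at t into 0, ..., t and t+1, ..., L-1; quadrant A is upper left, B upper right, D lower left and C lower right.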
2.3. Algorithms of Pal and Pal(1)

The algorithms suggested by Pal and Pal(1) attempt
to take advantage of spatial correlation in an image.
To do so, they introduced two concepts of second-order
entropy based on Equations (4)-(7), which are
called local entropy and joint entropy.

Since quadrant A and quadrant C reflect the local
transitions from background to background (BB) and
object to object (OO), they defined the local entropy of
the background and the local entropy of the object by
$H_B^{(2)}(t)$ and $H_O^{(2)}(t)$, respectively, as follows:

$$ H_B^{(2)}(t) = -\frac{1}{2} \sum_{i=0}^{t} \sum_{j=0}^{t} p_{ij}^{A} \log p_{ij}^{A}, \tag{8} $$

$$ H_O^{(2)}(t) = -\frac{1}{2} \sum_{i=t+1}^{L-1} \sum_{j=t+1}^{L-1} p_{ij}^{C} \log p_{ij}^{C}. \tag{9} $$

It should be noted that (8) and (9) are determined by
the threshold t; thus they are functions of t.

By summing the local entropies of the object and
the background, the second-order local entropy is
obtained as

$$ H_{local}^{(2)}(t) = H_B^{(2)}(t) + H_O^{(2)}(t). \tag{10} $$

The algorithm proposed by Pal and Pal(1) selects a
threshold which maximizes $H_{local}^{(2)}(t)$ over t. In
this paper, it will be called the local entropy-based
algorithm.

Alternatively, quadrant B and quadrant D provide
edge information on transitions from background to
object (BO) and object to background (OB). In analogy
with the local entropy defined above, a second-order
joint entropy of the background and the object
was also defined in reference (1). It is obtained
by averaging the entropy H(B;O) resulting
from quadrant B and the entropy H(O;B) from
quadrant D:

$$ H_{joint}^{(2)}(t) = \frac{H(B;O) + H(O;B)}{2} = -\frac{1}{2} \left( \sum_{i=0}^{t} \sum_{j=t+1}^{L-1} p_{ij}^{B} \log p_{ij}^{B} + \sum_{i=t+1}^{L-1} \sum_{j=0}^{t} p_{ij}^{D} \log p_{ij}^{D} \right). \tag{11} $$

The algorithm maximizing (11) is called the joint
entropy-based algorithm, which is the second algorithm
developed by Pal and Pal.(1)
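The two criteria of Pal and Pal can be sketched together. The code assumes that the cell probabilities are the transition probabilities divided by the probability of their quadrant, that the local entropies of Equations (8) and (9) each carry a factor 1/2, that the logarithm is natural, and that an empty quadrant contributes zero entropy. None of these choices is fixed by the text above, and the function names are illustrative.

```python
import math


def quadrant_entropy(p, rows, cols):
    """Entropy -sum(c * log c) of the cell probabilities c = p_ij / P_quadrant."""
    mass = sum(p[i][j] for i in rows for j in cols)
    if mass <= 0.0:
        return 0.0                       # empty quadrant: treated as zero entropy
    entropy = 0.0
    for i in rows:
        for j in cols:
            if p[i][j] > 0.0:            # convention 0 * log 0 = 0
                cell = p[i][j] / mass
                entropy -= cell * math.log(cell)
    return entropy


def local_entropy(p, t):
    """Eq. (10): sum of Eqs. (8) and (9) over quadrants A and C."""
    low, high = range(t + 1), range(t + 1, len(p))
    return 0.5 * quadrant_entropy(p, low, low) + 0.5 * quadrant_entropy(p, high, high)


def joint_entropy(p, t):
    """Eq. (11): average of the entropies of quadrants B and D."""
    low, high = range(t + 1), range(t + 1, len(p))
    return (quadrant_entropy(p, low, high) + quadrant_entropy(p, high, low)) / 2.0


def best_threshold(p, criterion):
    """Grey level t in 0..L-2 that maximizes the chosen criterion."""
    return max(range(len(p) - 1), key=lambda t: criterion(p, t))


uniform = [[1.0 / 16.0] * 4 for _ in range(4)]   # toy 4-level transition matrix
```

For the uniform toy matrix both criteria peak at the middle threshold, since that split gives the two quadrants involved the most evenly sized sets of cells.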


3.  RELATIVE  ENTROPY-BASED 
THRESHOLDING  TECHNIQUE 

3.1.  Definition  of  relative  entropy 

Let S be an L-symbol source and let $p_j$ and $p'_j$ be two
probability distributions defined on S. The relative
entropy between p and p' (or equivalently, the entropy
of p relative to p') is defined by

$$ L(p; p') = \sum_{j=0}^{L-1} p_j \log \frac{p_j}{p'_j}. \tag{12} $$

The definition given by Equation (12) was first
introduced by Kullback as a distance measure between
two probability distributions, and was later found to
be very useful in many applications. Since the
information contained in an image source can be
described by its entropy, which in turn is completely
characterized by the source symbol probabilities, the relative
entropy provides a criterion to measure the
discrepancy between two images determined by probability
distributions $p_j$ and $p'_j$, respectively. The smaller
the relative entropy, the smaller the discrepancy. It is
natural to use relative entropy as a measure of difference
between an image and its segmented image; in our case,
a bilevel thresholded image. There are several synonyms
for relative entropy, e.g. cross entropy, Kullback-Leibler's
discrimination distance function and directed
divergence.

In order to obtain a bilevel image of good quality,
our aim is to find a threshold to segment an image such
that the resulting thresholded bilevel image will best
match the original image. Using the measure of relative
entropy, one can choose the threshold t in such a
manner that the grey-level probability distribution $p'_j$
of the thresholded image has minimum relative entropy
L(p;p') with respect to that of the original image, $p_j$.
More specifically, the desired threshold t minimizes the
discrepancy between p and p', where p and p' are the
grey-level probability distributions of the original
image and the resulting thresholded image, respectively.
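Equation (12) translates almost directly into code. The sketch below assumes natural logarithms, the usual convention that terms with p_j = 0 contribute nothing, and an infinite value when p' assigns zero probability to a symbol that p can emit; these conventions and the function name are assumptions rather than statements from the paper.

```python
import math


def relative_entropy(p, q):
    """Eq. (12): L(p; q) = sum_j p_j * log(p_j / q_j) for two distributions."""
    total = 0.0
    for pj, qj in zip(p, q):
        if pj > 0.0:                 # 0 * log(0 / q) is taken as 0
            if qj == 0.0:
                return math.inf      # p puts mass where q has none
            total += pj * math.log(pj / qj)
    return total


same = relative_entropy([0.5, 0.5], [0.5, 0.5])
forward = relative_entropy([0.5, 0.5], [0.25, 0.75])
backward = relative_entropy([0.25, 0.75], [0.5, 0.5])
```

The measure is zero only when the two distributions coincide, and it is not symmetric in its arguments, which is why the order of p and p' matters in what follows.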
3.2. Joint relative entropy-based approach

As indicated previously, a thresholding method based
on first-order statistics of an image does not consider
the spatial correlation of the image. Exploiting
the spatial dependency of the pixel values in the image
can therefore help to determine a good threshold. It seems
reasonable to extend the first-order relative entropy
defined in the previous subsection to a second-order
joint relative entropy between $p_{ij}$ and $p'_{ij}$, where $p_{ij}$ and
$p'_{ij}$ are the transition probability distributions of the
co-occurrence matrices defined by Equations (1) and (2)
generated by the original image and the thresholded
image, respectively. Since transition probability
distributions defined by the co-occurrence matrix contain
the spatial information which reflects homogeneity
within groups (quadrants A and C in Fig. 1) and
changes across boundaries (quadrants B and D in
Fig. 1), one can envision that a better result may be
obtained if we choose the thresholded bilevel image to
be the one which has the best transition match to that
of the original image in terms of relative entropy.

Let the joint relative entropy of the probability
distributions $p_{ij}$ and $p'_{ij}$ be defined by

$$ L(p; p') = \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} p_{ij} \log \frac{p_{ij}}{p'_{ij}}, \tag{13} $$

where $p_{ij}$ and $p'_{ij}$ are the transition probabilities from
grey level i to grey level j of the original image and the
bilevel image, respectively. Minimizing L(p;p') over t
generally renders a bilevel image which best matches
the original image.

It should be noted that when we threshold an image,
we assign all gray levels in the original image
to either 0 or 1, which correspond to background and
objects, respectively. As a result, there are only two grey levels in
the thresholded image. The subscript ij used in the
notation of the transition probability $p'_{ij}$ still refers to the
grey levels of the original image. In addition, the
statistics of pixels not adjacent to one another could also
be considered, but the estimation of probabilities for
such cases would be very difficult. In this paper we only
consider the asymmetric co-occurrence matrix defined
in Section 2.1 for joint relative entropy.


3.3.  Co-occurrence  matrix  of  a  thresholded 
bilevel  image 

Let us assume that t is the selected threshold. By
assigning 1 to all grey levels above the threshold t,
$G_1 = \{t+1, \ldots, L-1\}$, and 0 to all grey levels below or
equal to t, $G_0 = \{0, 1, \ldots, t\}$, we obtain a binary image.
It should be noted that the grey levels in $G_1$ are treated as
equally likely, as are the grey levels in $G_0$.
Consequently, the $p'_{ij}$ can be found as follows (see
Fig. 1):

$$ p'^{(A)}_{ij}(t) = q_A(t) = \frac{P_A(t)}{(t+1)(t+1)} \quad \text{for } 0 \le i \le t,\ 0 \le j \le t, \tag{14} $$

$$ p'^{(B)}_{ij}(t) = q_B(t) = \frac{P_B(t)}{(t+1)(L-t-1)} \quad \text{for } 0 \le i \le t,\ t+1 \le j \le L-1, \tag{15} $$

$$ p'^{(C)}_{ij}(t) = q_C(t) = \frac{P_C(t)}{(L-t-1)(L-t-1)} \quad \text{for } t+1 \le i \le L-1,\ t+1 \le j \le L-1, \tag{16} $$

$$ p'^{(D)}_{ij}(t) = q_D(t) = \frac{P_D(t)}{(L-t-1)(t+1)} \quad \text{for } t+1 \le i \le L-1,\ 0 \le j \le t, \tag{17} $$

where $P_A(t)$, $P_B(t)$, $P_C(t)$ and $P_D(t)$ are defined by
Equation (3). For each selected t, $p'^{(A)}_{ij}(t)$, $p'^{(B)}_{ij}(t)$, $p'^{(C)}_{ij}(t)$
and $p'^{(D)}_{ij}(t)$ are constant within each individual quadrant
and depend only upon the quadrant to which they
belong. Therefore, we can denote them by $q_A(t)$, $q_B(t)$,
$q_C(t)$ and $q_D(t)$, respectively.
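To make Equations (13)-(17) concrete, the following sketch builds the full matrix of bilevel transition probabilities for a given t and evaluates the joint relative entropy directly from its definition. It assumes 0 <= t <= L-2 so that both classes are non-empty, uses natural logarithms, and treats terms with p_ij = 0 as zero; the function names and the toy matrix are hypothetical.

```python
import math


def bilevel_transition_matrix(p, t):
    """p'_ij of Eqs. (14)-(17): each quadrant's probability spread evenly over its cells."""
    n = len(p)

    def side(level):                     # 0 for background levels 0..t, 1 for object levels
        return 0 if level <= t else 1

    mass = [[0.0, 0.0], [0.0, 0.0]]      # [[P_A, P_B], [P_D, P_C]]
    for i in range(n):
        for j in range(n):
            mass[side(i)][side(j)] += p[i][j]
    size = [t + 1, n - t - 1]            # number of grey levels in each class
    return [[mass[side(i)][side(j)] / (size[side(i)] * size[side(j)])
             for j in range(n)] for i in range(n)]


def joint_relative_entropy(p, q):
    """Eq. (13), skipping cells with p_ij = 0."""
    n = len(p)
    return sum(p[i][j] * math.log(p[i][j] / q[i][j])
               for i in range(n) for j in range(n) if p[i][j] > 0.0)


# Toy matrix whose transitions stay inside levels {0, 1} or inside levels {2, 3}.
blocky = [[0.20, 0.20, 0.00, 0.00],
          [0.20, 0.20, 0.00, 0.00],
          [0.00, 0.00, 0.05, 0.05],
          [0.00, 0.00, 0.05, 0.05]]
match_at_1 = joint_relative_entropy(blocky, bilevel_transition_matrix(blocky, 1))
match_at_0 = joint_relative_entropy(blocky, bilevel_transition_matrix(blocky, 0))
```

Because the toy matrix is already constant within the quadrants induced by t = 1, the bilevel model reproduces it there and the relative entropy is essentially zero, whereas t = 0 gives a strictly positive value.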


3.4.  Relative  entropy-based  algorithm 
By expanding Equation (13), we have:

$$ L(p; p') = \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} p_{ij} \log \frac{p_{ij}}{p'_{ij}} = \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} p_{ij} \log p_{ij} - \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} p_{ij} \log p'_{ij}. \tag{18} $$

Because the first term in Equation (18) is independent
of the threshold t, minimizing the relative entropy
described by Equation (13) is equivalent to maximizing
the second term of Equation (18).

We can further simplify the second term on the
right side of Equation (18) as follows:

$$ \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} p_{ij} \log p'_{ij} = \sum_{A} p_{ij} \log q_A(t) + \sum_{B} p_{ij} \log q_B(t) + \sum_{C} p_{ij} \log q_C(t) + \sum_{D} p_{ij} \log q_D(t) $$

$$ = P_A(t) \log q_A(t) + P_B(t) \log q_B(t) + P_C(t) \log q_C(t) + P_D(t) \log q_D(t). \tag{19} $$

This implies that in order to obtain a desirable
threshold for separating the object from the background,
we need only maximize the last expression in Equation
(19) over t. The expression consists of only four terms,
each of which is a product of $P_k(t)$ and $\log q_k(t)$ for k = A,
B, C, D. In comparison with Equations (8) and (9)
required for the local entropy and Equation (11) for
the joint entropy, the computational load for the relative
entropy is significantly reduced. From Equations (8)
and (9), $(t+1)^2 + (L-t-1)^2$ multiplications are required
to calculate the $p_{ij} \log p_{ij}$ terms for the local entropy, and
from Equation (11), $2(t+1)(L-t-1)$ multiplications for
the joint entropy. However, only four multiplications
and four divisions are needed in Equation (19) for the
relative entropy. As a result, the computational saving
can be tremendous when the size of an image is very
large.
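The selection rule implied by Equation (19) can be sketched as follows. The quadrant probabilities are read from a table of two-dimensional prefix sums, which is an implementation choice of this sketch and not something the paper prescribes; natural logarithms and the convention that an empty quadrant contributes zero are likewise assumptions, and the function names are illustrative.

```python
import math


def prefix_sums(p):
    """s[a][b] = sum of p[i][j] over i < a and j < b."""
    n = len(p)
    s = [[0.0] * (n + 1) for _ in range(n + 1)]
    for a in range(n):
        row_total = 0.0
        for b in range(n):
            row_total += p[a][b]
            s[a + 1][b + 1] = s[a][b + 1] + row_total
    return s


def criterion(s, t):
    """Last expression of Eq. (19): sum of P_k(t) * log q_k(t) for k = A, B, C, D."""
    n = len(s) - 1
    low, high = t + 1, n - t - 1         # number of grey levels in each class
    p_a = s[low][low]
    p_b = s[low][n] - p_a
    p_d = s[n][low] - p_a
    p_c = s[n][n] - p_a - p_b - p_d
    total = 0.0
    for mass, cells in ((p_a, low * low), (p_b, low * high),
                        (p_c, high * high), (p_d, high * low)):
        if mass > 0.0:                   # empty quadrant: 0 * log 0 taken as 0
            total += mass * math.log(mass / cells)
    return total


def relative_entropy_threshold(p):
    """Threshold t in 0..L-2 maximizing Eq. (19), i.e. minimizing Eq. (13)."""
    s = prefix_sums(p)
    return max(range(len(p) - 1), key=lambda t: criterion(s, t))


blocky = [[0.20, 0.20, 0.00, 0.00],
          [0.20, 0.20, 0.00, 0.00],
          [0.00, 0.00, 0.05, 0.05],
          [0.00, 0.00, 0.05, 0.05]]
chosen = relative_entropy_threshold(blocky)
```

For this toy matrix, whose transitions never cross between levels {0, 1} and {2, 3}, the criterion is largest at t = 1, where it equals the first term of Equation (18) and the relative entropy therefore vanishes.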

4.  EXPERIMENTAL  RESULTS 

In order to assess the performance of the relative entropy-based
thresholding method we conducted tests on a
set of various images. As the experiments show, the
relative entropy approach provides an efficient
and effective alternative image thresholding tool. All test
images have 256 grey levels. In all experiments, the
images labelled (a) are original images; the images
labelled (b), (c) and (d) are generated by the local
entropy, joint entropy and relative entropy, respectively.
All the figures labelled (e) represent the corresponding
grey-level histograms of the original images.
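For reference, a bilevel image such as those labelled (b)-(d) follows from a threshold t by the assignment rule of Section 3.3: grey levels above t become 1 and the rest become 0. A minimal sketch, assuming a list-of-rows image and a hypothetical function name:

```python
def threshold_image(image, t):
    """Map grey levels above t to 1 (objects) and all others to 0 (background)."""
    return [[1 if grey > t else 0 for grey in row] for row in image]


bilevel = threshold_image([[0, 127, 128],
                           [255, 90, 200]], 127)
```

A pixel whose grey level equals t falls in the background, consistent with the definition of the background class as the levels 0 to t.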

Experiment 1: Peppers image, Fig. 2(a)

From Fig. 2(b-d), it is obvious that the joint entropy
produced the worst image with threshold t = 90, while
the local entropy and relative entropy produced an
identical image since both generated the same threshold
t = 127.

Experiment 2: F-16 jet image, Fig. 3(a)

In this image, the three methods generated different
details, shown in Fig. 3(b-d). For instance, the local
entropy with threshold t = 115 gave the best description
of the lettering "F-16" on the tail, while the relative
entropy with threshold t = 175 shows more clearly the
cockpit, the insignia, and the lettering "US AIR FORCE"
on the fuselage. The joint entropy with t = 137 produced
an image between the quality of the other methods.

Experiment 3: Couple image, Fig. 4(a)

The relative entropy evidently produced the best
thresholded couple image, Fig. 4(d), compared
to those in Figs 4(b) and (c) generated by the
local entropy and joint entropy. The threshold used
for the relative entropy was 111, whereas both the local
entropy and joint entropy used the same threshold,
171.

Experiment 4: Building image, Fig. 5(a)

The building image is interesting. The local
entropy and joint entropy generated close thresholds,
t = 166 and t = 172, respectively. As a result, the
corresponding thresholded images, Figs 5(b) and (c), are
close. However, Fig. 5(d), produced by the relative
entropy using threshold t = 237, is quite different from
Figs 5(b) and (c). The local entropy and joint entropy
seem to give a better description of the building but
fail to pick up the middle edges of the building and
the outside stairs, which are shown in Fig. 5(d). The
reason for this is probably that the relative entropy can
best match all possible transitions made from one grey
level to another. This seems to be supported by the
grey-level histogram of the building image given in
Fig. 5(e). It should be noted that the histogram of
Fig. 5(e) is very different from those of the previous images,
Figs 2(e), 3(e) and 4(e).

Experiment 5: Coffee cup image, Fig. 6(a)

The coffee cup image has a grey-level histogram, Fig. 6(e),
very similar to that of the building
image, Fig. 5(e). Coincidentally,
the relative entropy produced the same threshold,
t = 237, that was used for the building image. The
local entropy and joint entropy produced Fig. 6(b)
and (c) with t = 130 and t = 156, respectively. As shown
in Fig. 6(b)-(d), Fig. 6(d) picks up the open edge of the
cup while Fig. 6(b) and (c) show the side edges of the
cup.

Experiment 6: Vase image, Fig. 7(a)

The grey-level histogram of the vase image is very
different from those of the other images in that its grey
levels are distributed more uniformly. The
image in Fig. 7(c), produced by t = 163 for the joint entropy,
does not seem as good as Fig. 7(b) and (d), with t = 125
for the local entropy and t = 132 for the relative entropy,
respectively. In addition, Fig. 7(d) looks a little better
than Fig. 7(b), since there is blurring over the top of
the vase in Fig. 7(b).

Experiment 7: Lena image, Fig. 8(a)

Figure 8(b)-(d) shows that the quality of the images
produced by t = 159 for the local entropy, t = 124 for
the joint entropy and t = 170 for the relative entropy
is nearly the same, except that they pick up different
fine details. For instance, the relative
entropy shows Lena's mouth at the expense of some
details of the feather on Lena's hat. In contrast, the
local entropy and the joint entropy give a little more
detail on the feather while missing Lena's mouth and
some details of Lena's hat.

Experiment 8: City image, Fig. 9(a)

As with the Lena image, the city images produced by
t = 123 for the local entropy, t = 128 for the joint
entropy and t = 112 for the relative entropy are very
close. It is interesting to compare the grey-level histograms
of these two images. If one of them is flipped
over, their distributions turn out to be very
similar. Therefore, these two experiments should be
expected to have similar results.

Based on the experiments conducted above, it seems
that the grey-level histograms of these eight images can
be roughly classified into four categories. The first
three experiments (peppers, F-16 jet and couple images)
are grouped together into Category 1 since their histograms
have many saw-like sharp peaks with short
durations. The next two experiments (building and cup
images) are in Category 2 because their histograms share the same
characteristics: they are Gaussian-like with three sharp peaks.
The last two experiments
(Lena and city images) are in Category 3 because of
their very similar mountain-like histograms. The histogram
of the vase image is completely different from
those of all the other images and stands alone as
Category 4 because it is more or less
uniformly distributed. Table 1 summarizes the thresholds
obtained for each image. As shown in the table,
the thresholds are image dependent. Although more
experiments need to be performed, our experiments
show that the local entropy and relative entropy perform
better than the joint entropy in most cases, and
the relative entropy can compete with the local entropy.
The experiments also show that the relative
entropy approach can complement the local entropy
and joint entropy approaches by providing
details which they miss.
This is particularly true for the F-16
jet, building and cup images.

Fig. 4. Image of couple: (a) original image, (b) image obtained by local entropy, t = 171, (c) image obtained
by joint entropy, t = 171, (d) image obtained by relative entropy, t = 111, (e) grey-level histogram.

Fig. 5. Building image: (a) original image, (b) image obtained by local entropy, t = 166, (c) image obtained
by joint entropy, t = 172, (d) image obtained by relative entropy, t = 237, (e) grey-level histogram.

Fig. 6. Coffee cup image: (a) original image, (b) image obtained by local entropy, t = 130, (c) image obtained
by joint entropy, t = 156, (d) image obtained by relative entropy, t = 237, (e) grey-level histogram.

Fig. 7. Vase image: (a) original image, (b) image obtained by local entropy, t = 125, (c) image obtained by
joint entropy, t = 163, (d) image obtained by relative entropy, t = 132, (e) grey-level histogram.

Fig. 8. Lena image: (a) original image, (b) image obtained by local entropy, t = 159, (c) image obtained by
joint entropy, t = 124, (d) image obtained by relative entropy, t = 170, (e) grey-level histogram.

Fig. 9. City image: (a) original image, (b) image obtained by local entropy, t = 123, (c) image obtained by
joint entropy, t = 128, (d) image obtained by relative entropy, t = 112, (e) grey-level histogram.

Table 1. Images versus thresholds for three methods

Images          Min (grey level)  Max (grey level)  Local entropy  Joint entropy  Relative entropy
1. Peppers      0                 242               127            90             127
2. F-16 jet     13                255               115            137            175
3. Couple       70                255               171            171            111
4. Building     40                255               166            172            237
5. Coffee cup   78                255               130            156            237
6. Vase         0                 255               125            163            132
7. Lena         57                255               159            124            170
8. City         34                219               123            128            112

5.  CONCLUSION 

A new thresholding method based on the relative
entropy concept is presented in this paper. The idea is
to find a threshold which minimizes the mismatch
between two transition probability distributions
resulting from the co-occurrence matrices of an image
and a thresholded image. The proposed approach is
different from the local entropy and joint entropy
methods suggested by N. R. Pal and S. K. Pal.(1) In
order to demonstrate the performance of the relative
entropy approach, several images are studied in comparison
to the local and joint entropy approaches. The
experimental results show that the relative entropy-based
method is a good alternative to the local and
joint entropy methods. Particularly interesting is that
the relative entropy approach can complement the
local entropy and joint entropy methods, and it
demonstrates a good capability for picking up the
edges of objects. In addition, the computational complexity
of calculating entropy required for the relative
entropy method is far less than that required for the local
entropy and joint entropy algorithms.

6.  SUMMARY 

Image thresholding using information-theoretic approaches
based on Shannon's entropy concept has
received considerable interest in recent years. Of particular
interest are two methods proposed by N. R. Pal
and  S.  K.  Pal  which  use  a  co-occurrence  matrix  to 
define second-order local and joint entropies. The co-occurrence
matrix is a transition matrix generated by
changes in pixel intensities. For any two grey
levels i and j (not necessarily distinct), the co-occurrence
matrix describes all intensity transitions
from grey level i to grey level j. Suppose that t is the
desired threshold. This t then segments an image into
the  background  which  contains  pixels  with  grey  levels 
below or equal to t and the foreground which corresponds
to objects having pixels with grey levels above
t. This t further divides the co-occurrence matrix into
four quadrants which correspond to transitions from
background to background (BB), background to objects
(BO), objects to background (OB) and objects to
objects (OO). The local entropy is defined only on two
quadrants, BB and OO, whereas the joint entropy is
defined only on the other two quadrants, BO and OB.
Based on these two definitions, Pal and Pal developed
two  algorithms,  one  which  maximizes  local  entropy, 
and  the  other  which  maximizes  joint  entropy. 
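
The quadrant partition described above can be illustrated with a minimal Python sketch. It assumes NumPy, an integer-valued image with grey levels 0 to levels - 1, and transitions counted from each pixel to its right-hand and lower neighbours. That neighbourhood choice and the function names `cooccurrence` and `quadrant_mass` are illustrative assumptions, not definitions taken from this summary.

```python
import numpy as np


def cooccurrence(img, levels):
    """Count grey-level transitions i -> j between adjacent pixels.

    Assumption: a transition goes from a pixel to its right-hand
    neighbour and to the neighbour directly below it.
    """
    img = np.asarray(img, dtype=np.intp)
    counts = np.zeros((levels, levels), dtype=np.float64)
    # Horizontal transitions: (r, c) -> (r, c + 1)
    np.add.at(counts, (img[:, :-1].ravel(), img[:, 1:].ravel()), 1)
    # Vertical transitions: (r, c) -> (r + 1, c)
    np.add.at(counts, (img[:-1, :].ravel(), img[1:, :].ravel()), 1)
    return counts


def quadrant_mass(counts, t):
    """Split the normalised matrix at threshold t into BB, BO, OB, OO.

    Grey levels <= t are background (B); levels > t are objects (O).
    """
    p = counts / counts.sum()
    return {
        "BB": p[:t + 1, :t + 1].sum(),
        "BO": p[:t + 1, t + 1:].sum(),
        "OB": p[t + 1:, :t + 1].sum(),
        "OO": p[t + 1:, t + 1:].sum(),
    }


counts = cooccurrence([[0, 0, 3], [0, 3, 3]], levels=4)
mass = quadrant_mass(counts, t=1)
```

For this 2 x 3 toy image, seven transitions are counted, and with t = 1 the quadrant masses are BB = 2/7, BO = 3/7, OB = 0 and OO = 2/7. The BO and OB masses collect the edge transitions, while BB and OO collect the within-region transitions.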

In this paper, we present an alternative entropy-based
approach which is different from previous approaches.
Rather than looking into entropies of background
or object individually, we introduce the concept
of the relative entropy (also known as cross entropy,
Kullback-Leibler's discrimination distance and directed
divergence) that has been widely used in source coding
for the purpose of measuring the mismatch between
two sources. Since a source is generally characterized
by a probability distribution, the relative entropy can
also be interpreted as a distance measure between two
sources. This suggests that the relative entropy can be
used as a criterion to measure the mismatch between
an image and a thresholded bilevel image. One method
to apply the relative entropy concept to image thresholding
is to calculate the grey-level transition probability
distributions  of  the  co-occurrence  matrices  for  an  image 
and  a  thresholded  bilevel  image,  then  find  a  threshold 
which  minimizes  the  discrepancy  between  these  two 
transition  probability  distributions,  i.e.  their  relative 
entropy. The smaller the discrepancy, the better the
matching between the original image and the thresholded
image. The threshold rendering the smallest
relative entropy will be selected to segment the image.
As a result, the thresholded bilevel image will be the
best approximation to the original image. Since transitions
of OB and BO generally represent edge changes
in  boundaries  and  transitions  of  BB  and  OO  indicate 
local  changes  in  regions,  we  can  anticipate  that  a 
thresholded  bilevel  image  produced  by  the  proposed 
relative entropy approach will best match the co-occurrence
matrix of the original image. This observation
is demonstrated experimentally on several test
images. Although there is no evidence that any one
method is generally better than the others, according
to  the  experiments  conducted  in  this  paper,  the  joint 
entropy  did  not  work  as  well  as  did  the  local  entropy 
and  relative  entropy.  Interestingly,  among  all  images 
tested  the  relative  entropy  approach  seems  to  perform 
better  than  the  others  in  finding  edges.  In  addition,  our 
experiments  show  that  the  relative  entropy  seems  to 
be  a  good  complement  to  the  local  entropy  and  joint 
entropy  methods  in  terms  of  providing  different  image 
details  and  descriptions  from  those  provided  by  the 
local entropy and joint entropy. Finally, an advantage
of the relative entropy approach is its computational
saving: it requires fewer arithmetic operations for
calculating entropies than the local and joint entropy
approaches.
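
The threshold search described in this paragraph can be sketched in Python as follows. The sketch assumes NumPy and an integer-valued image with grey levels 0 to levels - 1. It also assumes that the thresholded bilevel image assigns a uniform transition probability to every cell of a quadrant, equal to that quadrant's total probability mass divided by its number of cells. This summary does not spell out that normalisation, so it should be read as one plausible choice. The function names are hypothetical.

```python
import numpy as np


def cooccurrence(img, levels):
    """Transition counts to the right-hand and lower neighbours (assumed)."""
    img = np.asarray(img, dtype=np.intp)
    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (img[:, :-1].ravel(), img[:, 1:].ravel()), 1)
    np.add.at(counts, (img[:-1, :].ravel(), img[1:, :].ravel()), 1)
    return counts


def relative_entropy_threshold(img, levels):
    """Return (t, J) where t minimises the relative entropy J(p; q_t).

    p   : transition probabilities of the original image
    q_t : transition probabilities assumed for the image thresholded at t,
          uniform inside each of the quadrants BB, BO, OB and OO
    """
    p = cooccurrence(img, levels)
    p /= p.sum()
    nz = p > 0                      # terms with p_ij = 0 contribute nothing
    best_t, best_j = None, np.inf
    for t in range(levels - 1):     # keep both classes non-empty
        nb, no = t + 1, levels - t - 1
        q = np.empty_like(p)
        q[:nb, :nb] = p[:nb, :nb].sum() / (nb * nb)   # BB
        q[:nb, nb:] = p[:nb, nb:].sum() / (nb * no)   # BO
        q[nb:, :nb] = p[nb:, :nb].sum() / (no * nb)   # OB
        q[nb:, nb:] = p[nb:, nb:].sum() / (no * no)   # OO
        # q_ij > 0 wherever p_ij > 0, so the ratio is well defined
        j = float(np.sum(p[nz] * np.log(p[nz] / q[nz])))
        if j < best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy image: left half at grey level 2, right half at grey level 6
img = np.zeros((4, 4), dtype=int)
img[:, :2] = 2
img[:, 2:] = 6
t, j = relative_entropy_threshold(img, levels=8)
bilevel = img > t                   # True marks object pixels
```

Because the same quadrant masses appear in both distributions, each candidate threshold costs only four block sums and one weighted logarithm sum. The loop is an exhaustive search over all grey levels, which is adequate for the usual 256-level images.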